Llama 2 - work4ai

Llama 2

https://arxiv.org/abs/2307.09288 Open Foundation and Fine-Tuned Chat Models

https://ai.meta.com/llama/#partnerships Introducing Llama 2

https://ai.meta.com/resources/models-and-libraries/llama-downloads/ Models

https://github.com/facebookresearch/llama/tree/main facebookresearch/llama

https://github.com/facebookresearch/llama-recipes/ Llama 2 Fine-tuning / Inference Recipes and Examples

https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/ Llama 2: Open Foundation and Fine-Tuned Chat Models

https://gyazo.com/a7ef82a1319c5a12707df7ed5d4e0fbd

Models: https://huggingface.co/meta-llama

https://huggingface.co/meta-llama/Llama-2-7b Llama-2-7b

https://huggingface.co/meta-llama/Llama-2-13b Llama-2-13b

https://huggingface.co/meta-llama/Llama-2-70b Llama-2-70b

Quantized versions: https://huggingface.co/TheBloke

@PhysConsultant: LLaMA is a basic transformer with the following three modifications.

・pre-normalization using RMSNorm

・SwiGLU activation function

・rotary positional embeddings
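The three modifications above can be sketched in plain Python to show what each one computes. This is a minimal illustration, not Meta's implementation: the function names, the `eps` and `base` defaults, the bias-free weight matrices, and the choice to rotate consecutive pairs in `rope` are all assumptions made for this sketch.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def silu(v):
    """SiLU (Swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def matvec(matrix, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down(silu(gate(x)) * up(x)), without biases."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)

def rope(x, position, base=10000.0):
    """Rotary embedding: rotate each pair (x[i], x[i+1]) by position * base**(-i/d)."""
    d = len(x)  # must be even
    out = list(x)
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out

if __name__ == "__main__":
    x = [3.0, 4.0]
    # Pre-normalization: normalize the input of a sub-layer, not its output.
    print(rms_norm(x, weight=[1.0, 1.0]))
    print(rope([1.0, 0.0, 1.0, 0.0], position=3))
```

Because `rope` only rotates pairs, it preserves vector length, and the dot product of a rotated query and key depends only on the difference between their positions. That relative-position property is the reason the rotation is applied to queries and keys rather than added to the token embeddings.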

Llama 2 then made the following further changes.

・Context length doubled.

・Because the above makes the memory footprint very large, Grouped-Query Attention was adopted.
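The memory argument can be checked with rough arithmetic on the key/value cache. The sketch below assumes a 70B-like shape (80 layers, 64 query heads, head dimension 128, 8 key/value heads under Grouped-Query Attention) and 16-bit cache values; the helper names are hypothetical, and the Llama 2 paper describes Grouped-Query Attention for the larger model sizes rather than for every variant.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, batch=1):
    """Bytes needed to cache keys and values for every layer and position."""
    # The leading factor 2 accounts for storing both K and V.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value * batch

def kv_head_for_query(q_head, n_heads, n_kv_heads):
    """Index of the shared K/V head that a given query head attends with."""
    group_size = n_heads // n_kv_heads  # query heads per K/V head
    return q_head // group_size

# Assumed 70B-like shape, fp16 cache values.
N_LAYERS, N_HEADS, N_KV_HEADS, HEAD_DIM = 80, 64, 8, 128

mha_2k = kv_cache_bytes(N_LAYERS, N_HEADS, HEAD_DIM, seq_len=2048)
mha_4k = kv_cache_bytes(N_LAYERS, N_HEADS, HEAD_DIM, seq_len=4096)
gqa_4k = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, seq_len=4096)

if __name__ == "__main__":
    gib = 2 ** 30
    print(f"multi-head, 2048 tokens:    {mha_2k / gib:.2f} GiB")
    print(f"multi-head, 4096 tokens:    {mha_4k / gib:.2f} GiB")
    print(f"grouped-query, 4096 tokens: {gqa_4k / gib:.2f} GiB")
```

Under these assumptions, doubling the context doubles the per-sequence cache from 5 GiB to 10 GiB with one K/V head per query head, while sharing each K/V head across 8 query heads brings it down to 1.25 GiB. Model weights and activations are not counted here.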

Trying Llama 2 with Llama.cpp | npaka

License

Reading the Llama 2 license carefully